and North Carolina State University

Structural equation models are multivariate statistical models 
that are defined by specifying noisy functional relationships among 
random variables. We consider the classical case of linear relation- 
ships and additive Gaussian noise terms. We give a necessary and 
sufficient condition for global identifiability of the model in terms 
of a mixed graph encoding the linear structural equations and the 
correlation structure of the error terms. Global identifiability is un- 
derstood to mean injectivity of the parametrization of the model and 
is fundamental in particular for applicability of standard statistical 
methodology. 

1. Introduction. A mixed graph is a triple $G = (V, D, B)$ where $V$ is
a finite set of nodes and $D, B \subseteq V \times V$ are two sets of edges.
The edges in $D$ are directed, that is, $(i,j) \in D$ does not imply
$(j,i) \in D$. We denote and draw such an edge as $i \to j$. The edges in $B$
have no orientation; they satisfy $(i,j) \in B$ if and only if $(j,i) \in B$.
Following tradition in the field, we refer to these edges as bidirected and
denote and draw them as $i \leftrightarrow j$. (In figures, we will draw
bidirected edges also as dashed edges for better visual distinction.) We
emphasize that in this setup the bidirected part $(V, B)$ is always a simple
graph, that is, at most one bidirected edge may join a pair of nodes.
Moreover, neither the bidirected part $(V, B)$ nor the directed part $(V, D)$
contains self-loops, that is, $(i,i) \notin D \cup B$ for all $i \in V$. In
the main part of this work, the considered mixed graphs are acyclic, which
means that the directed part $(V, D)$ is a directed graph without directed
cycles.

Enumerate the vertex set as $V = [m] := \{1, \dots, m\}$. Let $\mathbb{R}^D$
be the set of matrices $\Lambda = (\lambda_{ij}) \in \mathbb{R}^{m \times m}$
with $\lambda_{ij} = 0$ if $i \to j$ is not in $D$. Write
$\mathbb{R}^D_{\mathrm{reg}}$ for the subset of matrices
$\Lambda \in \mathbb{R}^D$ for which $I - \Lambda$ is invertible, where $I$
denotes the identity matrix. Let $\mathrm{PD}(m)$ be the cone of positive
definite $m \times m$ matrices. Define $\mathrm{PD}(B)$ to be the set of
matrices $\Omega = (\omega_{ij}) \in \mathrm{PD}(m)$ with $\omega_{ij} = 0$
if $i \neq j$ and $i \leftrightarrow j$ is not an edge in $B$. Write
$\mathcal{N}_m(\mu, \Sigma)$ for the multivariate normal distribution with
mean vector $\mu \in \mathbb{R}^m$ and covariance matrix $\Sigma$.

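Definition 1. The linear structural equation model $\mathcal{M}(G)$
associated with an acyclic mixed graph $G = (V, D, B)$ is the family of
multivariate normal distributions $\mathcal{N}_m(0, \Sigma)$ with

    $\Sigma = (I - \Lambda)^{-T}\, \Omega\, (I - \Lambda)^{-1}$

for $\Lambda \in \mathbb{R}^D_{\mathrm{reg}}$ and $\Omega \in \mathrm{PD}(B)$.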

The set of parents of a node $i$, denoted $\mathrm{pa}(i)$, comprises the
nodes $j$ with $j \to i$ in $D$. The graphical model just defined is most
naturally motivated in terms of a system of linear structural equations:

(1.1)    $Y_j = \sum_{i \in \mathrm{pa}(j)} \lambda_{ij} Y_i + \varepsilon_j, \qquad j = 1, \dots, m.$

If $\varepsilon = (\varepsilon_1, \dots, \varepsilon_m)$ is a random vector
following the multivariate normal distribution $\mathcal{N}_m(0, \Omega)$ and
$\Lambda \in \mathbb{R}^D_{\mathrm{reg}}$, then the random vector
$Y = (Y_1, \dots, Y_m)$ is well defined as a solution to the equation system
in (1.1) and follows a centered multivariate normal distribution with
covariance matrix $(I - \Lambda)^{-T}\, \Omega\, (I - \Lambda)^{-1}$.
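
As a numerical illustration of Definition 1 and the system (1.1), the following minimal sketch evaluates the parametrization and compares it with the sample covariance of simulated data. The three-node graph (edges $1 \to 2$, $2 \to 3$ and $1 \leftrightarrow 3$), the parameter values and the function name `phi` are assumptions made for this example only; they do not come from the paper.

```python
import numpy as np

def phi(Lambda, Omega):
    """Return (I - Lambda)^{-T} Omega (I - Lambda)^{-1}."""
    m = Lambda.shape[0]
    inv = np.linalg.inv(np.eye(m) - Lambda)
    return inv.T @ Omega @ inv

# Hypothetical example graph: 1 -> 2, 2 -> 3 and 1 <-> 3 (0-based indices below).
Lambda = np.zeros((3, 3))
Lambda[0, 1] = 0.5    # lambda_12
Lambda[1, 2] = -2.0   # lambda_23
Omega = np.array([[1.0, 0.0, 0.3],
                  [0.0, 1.0, 0.0],
                  [0.3, 0.0, 1.0]])
Sigma = phi(Lambda, Omega)

# Solve Y = Lambda^T Y + eps for simulated errors; each row of Y is one draw.
rng = np.random.default_rng(0)
eps = rng.multivariate_normal(np.zeros(3), Omega, size=200_000)
Y = eps @ np.linalg.inv(np.eye(3) - Lambda)
sample_cov = np.cov(Y, rowvar=False)
```

Up to simulation error, `sample_cov` should agree with `Sigma`.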

Remark 1. Assuming centered distributions presents no loss of generality.
An arbitrary mean vector could be incorporated by adding an intercept
constant $\lambda_{j0}$ to each equation in (1.1). The results discussed
below would apply unchanged.

Linear structural equation models are ubiquitous in many applied fields, 
most notably in the social sciences where the models have a long tradi- 
tion. Recent renewed interest in the models stems from their causal inter- 
pretability; compare [11, 13]. While current research is often concerned with 
non-Gaussian generalizations of the models, there remain important open 
problems about the linear Gaussian models from Definition 1. These include 
the following fundamental problem, which concerns the global identifiability 
of the model parameters. 

Fig. 1. Acyclic mixed graph inducing a singular model.

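Question 1. For which mixed graphs $G = (V, D, B)$ is the rational
parametrization

    $\phi_G : (\Lambda, \Omega) \mapsto (I - \Lambda)^{-T}\, \Omega\, (I - \Lambda)^{-1}$

an injective map from $\mathbb{R}^D_{\mathrm{reg}} \times \mathrm{PD}(B)$ to
the positive definite cone $\mathrm{PD}(m)$?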

According to our first theorem, proven later on in Section 7, we can restrict 
attention to acyclic mixed graphs. 

Theorem 1. If $G$ is a mixed graph for which the parametrization $\phi_G$ is
injective, then $G$ is acyclic.

The nodes of an acyclic mixed graph $G = (V, D, B)$ can be ordered
topologically such that $i \to j \in D$ only if $i < j$. Under a topological
ordering of the nodes, all matrices in $\mathbb{R}^D$ are strictly
upper-triangular. Hence, $\mathbb{R}^D_{\mathrm{reg}} = \mathbb{R}^D$ because
$\det(I - \Lambda) = 1$ for all $\Lambda \in \mathbb{R}^D$. Moreover, the
parametrization $\phi_G$ is a polynomial map in the entries of $\Lambda$ and
$\Omega$ when $G$ is acyclic.

Characterizing the graphs with injective parametrization is important be- 
cause failure of injectivity can lead to failure of standard statistical methods. 
We briefly exemplify this issue for the models considered here and point the 
reader to [7] and references therein for a more detailed discussion. Briefly 
put, the problem is due to the fact that failure of injectivity can result in 
parameter spaces that are not smooth manifolds; compare in particular the 
examples in Section 1 of [7]. 

Example 1. Consider the graph $G = (V, D, B)$ from Figure 1. Let
$\Lambda = (\lambda_{ij})$ be the matrix in $\mathbb{R}^D$ with

Ai2 = 3, A23 = — ^, A34 = A45 = 1. 

Let $\Omega = (\omega_{ij})$ be the matrix in $\mathrm{PD}(B)$ with all
diagonal entries equal to 2 and



UJu — Wis — — W35 — 1. 

Fig. 2. Histograms of p-values for a likelihood ratio test.

It can be shown that at the specified point $(\Lambda, \Omega)$ the map
$\phi_G$ is not injective and the image of $\phi_G$ has a singularity.
Suppose we use the likelihood ratio test for testing the model
$\mathcal{M}(G)$ against the saturated alternative given by all multivariate
normal distributions on $\mathbb{R}^m$. The standard procedure would compare
the resulting likelihood ratio statistic to a chi-square distribution with
two degrees of freedom. Figure 2 illustrates the problems with this
procedure. What is plotted are histograms of p-values obtained from the
chi-square approximation. Each histogram is based on simulation of 20,000
samples of size $n = 100$ or $n = 1000$. The samples underlying the two
histograms in Figure 2(a), (b) are drawn from the multivariate normal
distribution with covariance matrix $\Sigma = \phi_G(\Lambda, \Omega)$ for
the above parameter choices. Many p-values being large, it is evident that
the test is too conservative. For comparison, we repeat the simulations with
$\lambda_{23} = 1/2$ and all other parameters unchanged. There is no
identifiability failure in this second scenario, the image of $\phi_G$ is
smooth in a neighborhood of the new covariance matrix and, as shown in
Figure 2(c), (d), the expected uniform distribution for the p-values emerges
in reasonable approximation.

Call a directed graph with at least two nodes an arborescence converging
to node $i$ if its edges form a spanning tree with a directed path from any
node $j \neq i$ to $i$. In other words, $i$ is the unique sink node. For a
mixed graph $G = (V, D, B)$ and a subset of nodes $A \subseteq V$, let
$D_A = D \cap (A \times A)$ be the set of directed edges with both endpoints
in $A$. Similarly, let $B_A = B \cap (A \times A)$, and define the mixed
subgraph induced by $A$ to be $G_A = (A, D_A, B_A)$. Our main result provides
the following answer to Question 1.

Theorem 2. The parametrization $\phi_G$ for an acyclic mixed graph
$G = (V, D, B)$ fails to be injective if and only if there is an induced
subgraph $G_A$, $A \subseteq V$, whose directed part $(A, D_A)$ contains a
converging arborescence and whose bidirected part $(A, B_A)$ is connected.
If $\phi_G$ is injective, then its inverse is a rational map.
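
The graphical condition of Theorem 2 can be checked by brute force on small graphs. The sketch below enumerates node subsets $A$ and tests whether the bidirected part of $G_A$ is connected and whether some node of $A$ can be reached from every other node of $A$ along directed edges inside $A$, which is equivalent to $(A, D_A)$ containing a converging arborescence. The function names and the example graphs are hypothetical choices for illustration, and the enumeration is exponential in the number of nodes.

```python
from itertools import combinations

def _reaches_all(nodes, adj, root):
    """True if every node in `nodes` is reachable from `root` inside `nodes`."""
    seen, stack = {root}, [root]
    while stack:
        v = stack.pop()
        for w in adj.get(v, ()):
            if w in nodes and w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == set(nodes)

def noninjective_witness(V, D, B):
    """Return a node set A violating the condition of Theorem 2, or None."""
    rev = {}  # reversed directed edges: j -> [i, ...] for each i -> j
    und = {}  # bidirected edges stored as an undirected adjacency list
    for i, j in D:
        rev.setdefault(j, []).append(i)
    for i, j in B:
        und.setdefault(i, []).append(j)
        und.setdefault(j, []).append(i)
    for size in range(2, len(V) + 1):
        for A in combinations(V, size):
            A = set(A)
            connected = _reaches_all(A, und, next(iter(A)))
            # Searching backwards from a sink finds the nodes that reach it.
            arborescence = any(_reaches_all(A, rev, sink) for sink in A)
            if connected and arborescence:
                return A
    return None

# Instrumental variable graph: 1 -> 2 -> 3 with 2 <-> 3 (not simple).
iv = noninjective_witness([1, 2, 3], [(1, 2), (2, 3)], [(2, 3)])
# Simple graph on three nodes: 1 -> 2 -> 3 with 1 <-> 3.
ok = noninjective_witness([1, 2, 3], [(1, 2), (2, 3)], [(1, 3)])
# Simple graph on four nodes: directed path plus a bidirected spanning tree.
bad4 = noninjective_witness([1, 2, 3, 4],
                            [(1, 2), (2, 3), (3, 4)],
                            [(1, 3), (1, 4), (2, 4)])
```

Here `iv` is the pair of nodes joined by both kinds of edge, `ok` is `None`, and `bad4` is the full node set.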

An acyclic mixed graph $G = (V, D, B)$ is simple if there is at most one
edge between any pair of nodes, that is, if $D \cap B = \emptyset$. Theorem 2
states in particular that only simple acyclic mixed graphs may have an
injective parametrization. Indeed, two edges $i \leftrightarrow j$ and
$i \to j$, respectively, connect and yield an arborescence in the subgraph
$G_{\{i,j\}}$.

Corollary 1. If the acyclic mixed graph $G$ has at most three nodes,
then $\phi_G$ is injective if and only if $G$ is simple. There are exactly
two unlabeled simple acyclic mixed graphs on four nodes with $\phi_G$ not
injective.

Proof. An arborescence involving three nodes contains two edges. The 
bidirected part of a simple mixed graph can only be connected if there are 
two further edges. However, a simple graph with three nodes has at most 
three edges. The two examples on four nodes are shown in Figure 3. □ 

A possibly cyclic mixed graph $G = (V, D, B)$ is simple if there is at most
one edge between any pair of nodes, that is, if $D \cap B = \emptyset$ and
the presence of an edge $i \to j$ in $D$ implies the absence of $j \to i$.
As shown in the next lemma, it is easy to give a direct proof of the fact
that only simple graphs can have an injective parametrization. The lemma
also clarifies that noninjectivity can be recognized in subgraphs, which is
a fact that is important for later proofs.

Fig. 3. The two unlabeled graphs on four nodes with noninjective parametrization.

Lemma 1. Suppose the map $\phi_G$ given by a mixed graph $G$ is injective.
Then $G$ is simple, and $\phi_H$ is injective for any (not necessarily
induced) subgraph $H$ of $G$.

Proof. If $H = (V', D', B')$ is a subgraph of $G = (V, D, B)$, that is,
$V' \subseteq V$, $D' \subseteq D$ and $B' \subseteq B$, then $\phi_H$ is
injective if and only if $\phi_G$ is injective at points that have all
parameters $\lambda_{ij}$ and $\omega_{ij}$ zero for edges
$(i,j) \in D \setminus D'$ or $(i,j) \in B \setminus B'$. If $G$ is not
simple, then there exist two distinct indices $i, j$ for which the graph
contains at least two of the three possible edges $i \to j$, $j \to i$ and
$i \leftrightarrow j$. If $V = \{i, j\}$, then $\phi_G$ is not injective
because it maps the at least 4-dimensional set
$\mathbb{R}^D_{\mathrm{reg}} \times \mathrm{PD}(B)$ to the 3-dimensional cone
of positive definite $2 \times 2$ matrices. If $|V| > 2$, then the claim
follows by passing to the subgraph induced by $\{i, j\}$. □

The remainder of the paper is organized as follows. Section 2 reviews the 
connection of our work to the existing literature on identifiability of struc- 
tural equation models. Section 3 lays out the natural stepwise approach to 
inversion of the parametrization $\phi_G$ in the case where the underlying graph
is acyclic. Necessity and sufficiency of the graphical condition from our main 
Theorem 2 are proven in Sections 4 and 5, respectively. In Section 6, we col- 
lect three lemmas used in the proof of sufficiency. Theorem 1 about directed 
cycles is proven in Section 7. Concluding remarks are given in Section 8. 

2. Prior work. Identifiability properties of structural equation models 
are a topic with a long history. A review of classical conditions, which do 
not take into account the finer graphical structure considered here, can be 
found, for instance, in the monograph [2]. A more recent sufficient condi- 
tion for global identifiability of the linear structural equation models from 
Definition 1 is due to [9, 12]. It requires the presence of a bidirected edge 
$i \leftrightarrow j$ to imply the absence of directed paths from $j$ to $i$ (and from $i$ to $j$).
Following [12], we call an acyclic mixed graph with this property ancestral. 
It is clear that an ancestral mixed graph is simple. We revisit the result 
about ancestral graphs in Corollary 2 below. 

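Other recent work, such as [3], considers a weaker identifiability
requirement for the model $\mathcal{M}(G)$ associated with a mixed graph
$G = (V, D, B)$. For a pair of matrices
$\Lambda_0 \in \mathbb{R}^D_{\mathrm{reg}}$ and
$\Omega_0 \in \mathrm{PD}(B)$, define the fiber

(2.1)    $\mathcal{F}(\Lambda_0, \Omega_0) = \{(\Lambda, \Omega) : \phi_G(\Lambda, \Omega) = \phi_G(\Lambda_0, \Omega_0),\ \Lambda \in \mathbb{R}^D_{\mathrm{reg}},\ \Omega \in \mathrm{PD}(B)\}.$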

The map $\phi_G$ is injective if and only if all its fibers contain only a
single point. If it holds instead that for generic choices of
$\Lambda \in \mathbb{R}^D_{\mathrm{reg}}$ and $\Omega \in \mathrm{PD}(B)$,
the fiber $\mathcal{F}(\Lambda, \Omega)$ contains only the single point
$(\Lambda, \Omega)$, then we say that the map $\phi_G$ is generically
injective and the model $\mathcal{M}(G)$ is generically identifiable.
Requiring a condition to hold for generic points means that the points at
which the condition fails form a lower-dimensional algebraic subset. In
particular, the condition holds for almost every point (in Lebesgue
measure), and some authors thus also speak of an almost everywhere
identifiable model; compare the lemma in [10]. When the substantive interest
is in all parameters of a model, generic identifiability constitutes a
minimal requirement. However, generically but not globally identifiable
models can have nonsmooth parameter spaces and thus present difficulties for
statistical inference; recall Example 1 that treats a generically
identifiable model.

The main theorem of [3], which we reprove in Corollary 3, states that
$\phi_G$ is generically injective for every simple acyclic mixed graph $G$.
The graph being simple and acyclic, however, is far from necessary for
generic injectivity of $\phi_G$. A classical counterexample is the
instrumental variable model based on the graph with edges
$1 \to 2 \to 3$ and $2 \leftrightarrow 3$. Cyclic models may also be
generically identifiable; for instance, see Example 3.6 in [7]. For recent
work on the topic, see [16] and references therein. To our knowledge,
characterizing the mixed graphs $G$ with generically injective
parametrization $\phi_G$ remains an open problem.
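
To make the distinction between generic and global injectivity concrete for the instrumental variable graph, the following sketch uses the standard fact that $\sigma_{13} = \lambda_{23}\sigma_{12}$ in this model, so that $\lambda_{23} = \sigma_{13}/\sigma_{12}$ whenever $\sigma_{12} \neq 0$. At $\lambda_{12} = 0$ the ratio is undefined, and two different parameter pairs are shown to give the same covariance matrix. The helper names `phi` and `iv_params` and all numerical values are assumptions chosen for illustration.

```python
import numpy as np

def phi(Lambda, Omega):
    """Return (I - Lambda)^{-T} Omega (I - Lambda)^{-1}."""
    inv = np.linalg.inv(np.eye(Lambda.shape[0]) - Lambda)
    return inv.T @ Omega @ inv

def iv_params(l12, l23, w23, w33=1.0):
    """Parameters for the graph 1 -> 2 -> 3, 2 <-> 3 (unit error variances 1, 2)."""
    Lambda = np.zeros((3, 3))
    Lambda[0, 1], Lambda[1, 2] = l12, l23
    Omega = np.eye(3)
    Omega[1, 2] = Omega[2, 1] = w23
    Omega[2, 2] = w33
    return Lambda, Omega

# Generic point: lambda_23 is recovered as sigma_13 / sigma_12.
Sigma = phi(*iv_params(0.5, 1.5, 0.3))
l23_recovered = Sigma[0, 2] / Sigma[0, 1]

# Non-generic points with lambda_12 = 0: two parameter pairs, one covariance.
Sigma_a = phi(*iv_params(0.0, 1.0, 0.0, 1.0))
Sigma_b = phi(*iv_params(0.0, 0.0, 1.0, 2.0))
same_image = np.allclose(Sigma_a, Sigma_b)
```

Both error covariance matrices in the second part are positive definite, so the two pairs are legitimate points of the parameter space with the same image.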

The linear structural equation models $\mathcal{M}(G)$ considered in this
paper are closely related to latent variable models known as semi-Markovian
causal models. These nonparametric models are obtained by subdividing the
bidirected edges, that is, each edge $i \leftrightarrow j$ is replaced by
two directed edges $i \leftarrow u_{ij} \to j$, where $u_{ij}$ is a new
node. Each node $u_{ij}$ added to the vertex set corresponds to a latent
variable; compare also [11, 12, 17]. Using results from [15], the work of
[14] gives graphical conditions for when (univariate or multivariate)
intervention distributions in acyclic semi-Markovian causal models are
identified. This work is based on manipulating recursive density
factorizations involving latent variables. If $G$ is an acyclic mixed graph
and the structural equation model $\mathcal{M}(G)$ is contained in the
semi-Markovian model for $G$, then $\mathcal{M}(G)$ is globally identified
provided that in the semi-Markovian model we can identify, for every node
$i$, the univariate intervention distribution for $i$ and intervention set
$\mathrm{pa}(i)$; see also Chapter 6 in [15].

For an acyclic mixed graph $G = (V, D, B)$, we may define a Gaussian model
$\mathcal{M}'(G)$ by assuming that both the observed and the latent
variables in the semi-Markovian model for $G$ have a joint multivariate
normal distribution. This creates an explicit connection to linear
structural equation models, and it is indeed possible that
$\mathcal{M}'(G) = \mathcal{M}(G)$. For instance, if there are no directed
edges ($D = \emptyset$), then $\mathcal{M}'(G) = \mathcal{M}(G)$ if and only
if the bidirected part $(V, B)$ is a forest of trees; see Corollary 3.4 in
[8]. If $D = \emptyset$ and $(V, B)$ is not a forest of trees, then
$\mathcal{M}(G)$ is strictly larger than $\mathcal{M}'(G)$. Therefore, other
nonnormal constructions would be required in order for the theorems in [14]
to furnish sufficient conditions for global identifiability of linear
structural equation models. We are unaware, however, of literature providing
a connection between semi-Markovian causal models and the linear structural
equation models from Definition 1 when non-Gaussian distributions are
assumed for the latent variables.

Finally, the existing counterexamples to identifiability of semi-Markovian
models involve binary variables and thus cannot be used to prove necessity
of an identifiability condition for the Gaussian models $\mathcal{M}(G)$.
However, despite this fact and the difficulties in relating the models
$\mathcal{M}(G)$ to semi-Markovian models, our graphical condition from
Theorem 2, which we first found by experimentation with computer algebra
software, coincides with that of [14]; the term "y-rooted C-tree" is used
there to refer to a mixed graph whose directed part is an arborescence
converging to node $y$ and whose bidirected part is a tree. A reader
familiar with the work in [15] will also recognize similarities between the
higher-level structure of the proofs given there and those in Section 5 of
this paper.

3. Stepwise inversion. Throughout this section, suppose that
$G = (V, D, B)$ is an acyclic mixed graph with vertex set $V = [m]$. The map
$\phi_G$ is injective if all its fibers contain only a single point; recall
the definition of a fiber in (2.1). Let
$\Sigma = \phi_G(\Lambda_0, \Omega_0)$ for two matrices
$\Lambda_0 \in \mathbb{R}^D$ and $\Omega_0 \in \mathrm{PD}(B)$. This section
describes how to find points $(\Lambda, \Omega)$ in the fiber
$\mathcal{F}(\Lambda_0, \Omega_0)$. In particular, we show in Lemma 2 that an
algebraic criterion can be used to decide whether the map $\phi_G$ is
injective. The lemma is proven after we describe a natural inversion
approach that uses the acyclic structure of the graph $G$ in a stepwise
manner. We remark that this stepwise inversion is closely related to the
idea of pseudo-variable regression used in the iterative conditional fitting
algorithm of [6].

For each $i \le m - 1$, let $P(i) = \mathrm{pa}(i + 1)$ be the parents of
node $i + 1$, and $S(i) = \{j \le i : j \leftrightarrow i + 1 \in B\}$ the
siblings of $i + 1$. (In other related work, the nodes incident to a
bidirected edge $i \leftrightarrow j$ have also been called "spouses" of each
other but we find "siblings" to be natural terminology given that a common
parent to the two nodes is introduced when subdividing the edge as discussed
in Section 2.)

Lemma 2. Suppose $G = (V, D, B)$ is an acyclic mixed graph with its nodes
labeled in a topological order. Then the parametrization $\phi_G$ is
injective if and only if the rank condition

    $\mathrm{rank}\bigl(\Omega_{[i] \setminus S(i),\,[i]}\,(I - \Lambda)^{-1}_{[i],\,P(i)}\bigr) = |P(i)|$

holds for all nodes $i = 1, \dots, m - 1$ and all pairs
$\Lambda \in \mathbb{R}^D$ and $\Omega \in \mathrm{PD}(B)$.

Remark 2. In this paper, matrix inversion is always given higher priority
than an operation of forming a submatrix. For any invertible matrix $M$ and
index sets $A, B$, the matrix $M^{-1}_{A,B} = (M^{-1})_{A,B}$ is thus the
$A \times B$ submatrix of the inverse of $M$.
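
The rank condition of Lemma 2 is easy to evaluate numerically at a given parameter pair. The following sketch does so for 0-based, topologically ordered nodes; the function name, the tolerance and the example graph (the instrumental variable graph $1 \to 2 \to 3$, $2 \leftrightarrow 3$) are assumptions for illustration. Note that it checks the condition at one pair only, whereas the lemma quantifies over all pairs.

```python
import numpy as np

def rank_condition_holds(Lambda, Omega, D, B, tol=1e-10):
    """Check the rank condition of Lemma 2 at the pair (Lambda, Omega).

    Nodes are 0..m-1 in topological order; D holds directed edges (j, t)
    with j < t, and B holds bidirected edges as pairs in either order.
    """
    m = Lambda.shape[0]
    inv = np.linalg.inv(np.eye(m) - Lambda)
    sib = {frozenset(e) for e in B}
    for t in range(1, m):                  # t is the 0-based label of node i + 1
        head = list(range(t))              # the index set [i]
        P = [j for j in head if (j, t) in D]
        rows = [j for j in head if frozenset((j, t)) not in sib]
        if not P:
            continue                       # no coefficients to determine
        if not rows:
            return False                   # empty matrix cannot have rank |P|
        M = Omega[np.ix_(rows, head)] @ inv[np.ix_(head, P)]
        if np.linalg.matrix_rank(M, tol=tol) < len(P):
            return False
    return True

D = {(0, 1), (1, 2)}
B = {(1, 2)}
Omega = np.array([[1.0, 0.0, 0.0],
                  [0.0, 1.0, 0.3],
                  [0.0, 0.3, 1.0]])
Lambda = np.zeros((3, 3))
Lambda[0, 1], Lambda[1, 2] = 0.5, 1.0
generic_ok = rank_condition_holds(Lambda, Omega, D, B)

Lambda0 = Lambda.copy()
Lambda0[0, 1] = 0.0                        # the instrument has no effect
degenerate_ok = rank_condition_holds(Lambda0, Omega, D, B)
```

For this graph the matrix at the last step reduces to the scalar $\omega_{11}\lambda_{12}$, so the condition holds exactly when $\lambda_{12} \neq 0$.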



Computing points $(\Lambda, \Omega)$ in the fiber
$\mathcal{F}(\Lambda_0, \Omega_0)$ means solving the polynomial equation
system given by the matrix equation

(3.1)    $\Sigma = (I - \Lambda)^{-T}\, \Omega\, (I - \Lambda)^{-1}.$

For topologically ordered nodes, (3.1) implies that
$\sigma_{11} = \omega_{11}$ and that the first column in the strictly
upper-triangular matrix $\Lambda$ contains only zeros. Hence, these are
uniquely determined for all matrices in the fiber.

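Let $i \ge 1$, and assume that we know the $[i] \times [i]$ submatrices of
$\Lambda$ and $\Omega$ of a solution to equation (3.1). Partition off the
$(i + 1)$st row and column of the submatrices

(3.2)    $(I - \Lambda)_{[i+1],[i+1]} = \begin{pmatrix} \Gamma & -\lambda \\ 0 & 1 \end{pmatrix}$  and  $\Omega_{[i+1],[i+1]} = \begin{pmatrix} \Psi & \omega \\ \omega^T & \omega_{i+1,i+1} \end{pmatrix}.$

The matrices $\Gamma$ and $\Psi$ are known,
$\lambda_{[i] \setminus P(i)} = 0$ and $\omega_{[i] \setminus S(i)} = 0$.
The inverse of $(I - \Lambda)_{[i+1],[i+1]}$ can be written as a block
matrix as

    $(I - \Lambda)^{-1}_{[i+1],[i+1]} = \begin{pmatrix} \Gamma^{-1} & \Gamma^{-1}\lambda \\ 0 & 1 \end{pmatrix}.$

In this notation, the part of equation (3.1) that pertains to the
$[i + 1] \times [i + 1]$ submatrix of $\Sigma$ is

    $\Sigma_{[i+1],[i+1]} = \begin{pmatrix} \Gamma^{-T}\Psi\Gamma^{-1} & \Gamma^{-T}\Psi\Gamma^{-1}\lambda + \Gamma^{-T}\omega \\ & \omega_{i+1,i+1} + \lambda^T\Gamma^{-T}\Psi\Gamma^{-1}\lambda + 2\omega^T\Gamma^{-1}\lambda \end{pmatrix},$

where only the upper-triangular parts of the symmetric matrices are shown.
Hence, given the values of $\Gamma$ and $\Psi$, the choice of $\lambda$ and
$\omega$ is unique if and only if the equation

(3.3)    $\Sigma_{[i],\{i+1\}} = \Gamma^{-T}\Psi\Gamma^{-1} \cdot \lambda + \Gamma^{-T} \cdot \omega$

has a unique solution. Clearly, any feasible choice of a solution
$(\lambda, \omega)$ to the equation in (3.3) leads to a unique solution
$\omega_{i+1,i+1}$ via the equation

(3.4)    $\sigma_{i+1,i+1} = \omega_{i+1,i+1} + \lambda^T\Gamma^{-T}\Psi\Gamma^{-1}\lambda + 2\omega^T\Gamma^{-1}\lambda.$

Since $\lambda_{[i] \setminus P(i)} = 0$ and
$\omega_{[i] \setminus S(i)} = 0$, equation (3.3) can be rewritten as

    $\Sigma_{[i],\{i+1\}} = (\Gamma^{-T}\Psi\Gamma^{-1})_{[i],P(i)} \cdot \lambda_{P(i)} + (\Gamma^{-1}_{S(i),[i]})^T \cdot \omega_{S(i)}.$

It has a unique solution if and only if the matrix

    $\bigl( (\Gamma^{-T}\Psi\Gamma^{-1})_{[i],P(i)} \quad (\Gamma^{-1}_{S(i),[i]})^T \bigr)$

has full column rank $|P(i)| + |S(i)|$. The matrix $\Gamma$ is invertible
because it is upper-triangular with ones along the diagonal. Thus, after
multiplying from the left by $\Gamma^T$, the condition is equivalent to

    $\bigl( (\Psi\Gamma^{-1})_{[i],P(i)} \quad I_{[i],S(i)} \bigr)$

having full column rank. The second block is part of an identity matrix. We
deduce that the condition is equivalent to requiring that
$\Psi_{[i] \setminus S(i),[i]}\,\Gamma^{-1}_{[i],P(i)}$, the submatrix
obtained by removing the rows and columns with index in $S(i)$, has rank
$|P(i)|$. Note that

    $\Psi_{[i] \setminus S(i),[i]}\,\Gamma^{-1}_{[i],P(i)} = \Omega_{[i] \setminus S(i),[i]}\,(I - \Lambda)^{-1}_{[i],P(i)}$

is the matrix appearing in Lemma 2.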
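
The stepwise procedure above can be sketched in code as follows, using the identity $\Gamma^{-T}\Psi\Gamma^{-1} = \Sigma_{[i],[i]}$ from the upper-left block of the partitioned equation. The linear system (3.3) is solved by least squares and the diagonal entry is then obtained from (3.4). This is a minimal sketch under the assumption that the rank condition holds at every step (otherwise the least-squares solution is only one point of the fiber); the function names and the example graph are hypothetical.

```python
import numpy as np

def phi(Lambda, Omega):
    """Return (I - Lambda)^{-T} Omega (I - Lambda)^{-1}."""
    inv = np.linalg.inv(np.eye(Lambda.shape[0]) - Lambda)
    return inv.T @ Omega @ inv

def stepwise_inverse(Sigma, D, B):
    """Recover (Lambda, Omega) from Sigma for 0-based, topologically ordered nodes."""
    m = Sigma.shape[0]
    Lambda = np.zeros((m, m))
    Omega = np.zeros((m, m))
    Omega[0, 0] = Sigma[0, 0]
    sib = {frozenset(e) for e in B}
    for t in range(1, m):                       # t is the label of node i + 1
        head = list(range(t))
        P = [j for j in head if (j, t) in D]
        S = [j for j in head if frozenset((j, t)) in sib]
        Gamma_inv = np.linalg.inv(np.eye(t) - Lambda[:t, :t])
        if P or S:
            # Coefficient matrix of (3.3) restricted to the free entries.
            coef = np.hstack([Sigma[np.ix_(head, P)], Gamma_inv.T[:, S]])
            sol = np.linalg.lstsq(coef, Sigma[:t, t], rcond=None)[0]
            Lambda[P, t] = sol[:len(P)]
            Omega[S, t] = sol[len(P):]
            Omega[t, S] = sol[len(P):]
        lam, om = Lambda[:t, t], Omega[:t, t]
        # Equation (3.4) solved for omega_{i+1,i+1}.
        Omega[t, t] = (Sigma[t, t] - lam @ Sigma[:t, :t] @ lam
                       - 2.0 * om @ Gamma_inv @ lam)
    return Lambda, Omega

# Hypothetical simple graph 1 -> 2 -> 3 with 1 <-> 3, written 0-based.
D, B = {(0, 1), (1, 2)}, {(0, 2)}
Lambda0 = np.zeros((3, 3))
Lambda0[0, 1], Lambda0[1, 2] = 0.5, -2.0
Omega0 = np.array([[1.0, 0.0, 0.3],
                   [0.0, 1.0, 0.0],
                   [0.3, 0.0, 1.0]])
Lambda_hat, Omega_hat = stepwise_inverse(phi(Lambda0, Omega0), D, B)
```

For this graph, which is simple with three nodes, the recovered pair should coincide with `(Lambda0, Omega0)` up to rounding.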



Proof of Lemma 2. Consider a feasible pair $(\Lambda, \Omega)$. If the rank
condition for this pair holds for all nodes $i = 1, \dots, m - 1$, then it
follows from the stepwise inversion procedure described above that the fiber
$\mathcal{F}(\Lambda, \Omega)$ contains only the single point
$(\Lambda, \Omega)$. Therefore, the rank condition holding for all nodes and
all matrix pairs implies that all fibers are singletons, or in other words,
that the map $\phi_G$ is injective.

Conversely, assume that the rank condition fails for some node
$i \le m - 1$ and matrix pair $(\Lambda, \Omega)$. If $i = m - 1$, then the
considered fiber $\mathcal{F}(\Lambda, \Omega)$ is positive-dimensional, and
$\phi_G$ is not injective. If $i < m - 1$, then it follows analogously that
the parametrization $\phi_H$ for the induced subgraph $H = G_{[i+1]}$ is not
injective. By Lemma 1, $\phi_G$ cannot be injective either. □

If the rank condition in Lemma 2 holds at a particular pair
$(\Lambda, \Omega)$, then the fiber $\mathcal{F}(\Lambda, \Omega)$ contains
only the pair $(\Lambda, \Omega)$. However, the converse is false in
general, that is, failure of the rank condition at a particular pair
$(\Lambda, \Omega)$ and vertex $i < m$ need not imply that the fiber
$\mathcal{F}(\Lambda, \Omega)$ contains more than one point. This may occur
even for a simple acyclic mixed graph.



Example 2. Consider the graph in Figure 4, set A12 = A23 
and choose the positive definite matrix 



A, 

The rank condition for this pair $(\Lambda, \Omega)$ fails at node $i = 3$.
Nevertheless, the fiber $\mathcal{F}(\Lambda, \Omega)$ is equal to
$\{(\Lambda, \Omega)\}$. If we set $\omega_{15} = 0$, however, then
$\mathcal{F}(\Lambda, \Omega)$ becomes one-dimensional. Using terminology
from econometrics/causality, the variable corresponding to node 5 behaves
like an "instrument"; compare, for instance, [11].

Lemma 2 allows us to give simple proofs of two established results in 
the graphical models literature. The proof of Corollary 2 emphasizes the 
special structure exhibited by ancestral graphs. The proof of Corollary 3 
demonstrates that the identity matrix always has a singleton as a fiber under 
the parametrization associated with a simple acyclic mixed graph. 

Corollary 2. If the acyclic mixed graph $G$ is ancestral, then the
parametrization $\phi_G$ is injective.

Proof. Recall that if $G = (V, D, B)$ is ancestral and
$i \leftrightarrow j$ is a bidirected edge in $G$, then there is no directed
path from $i$ to $j$ or $j$ to $i$. Suppose $V = [m]$ is topologically
ordered, and let $i$ be some node smaller than $m$. Pick a node
$j \in S(i)$. Then there may not exist a directed path from $j$ to a node in
$P(i)$. By Lemma 3, it follows that $(I - \Lambda)^{-1}_{S(i),P(i)} = 0$ and
thus

    $\Omega_{[i] \setminus S(i),[i]}\,(I - \Lambda)^{-1}_{[i],P(i)} = \Omega_{[i] \setminus S(i),[i] \setminus S(i)}\,(I - \Lambda)^{-1}_{[i] \setminus S(i),P(i)}.$

The latter matrix is the product of a principal and thus positive definite
submatrix of $\Omega$ and a matrix that contains the $P(i) \times P(i)$
identity matrix. It follows that this product has full column rank for all
feasible pairs $(\Lambda, \Omega)$ and all nodes $i \le m - 1$. By Lemma 2,
$\phi_G$ is injective. □

If the acyclic mixed graph $G$ is simple, then
$P(i) \subseteq [i] \setminus S(i)$ for all nodes $i \le m - 1$. Hence, the
matrix product appearing in the rank condition always has at least as many
rows as columns. The next generic identifiability result follows
immediately; recall the definitions in Section 2.

Corollary 3. If $G = (V, D, B)$ is a simple acyclic mixed graph, then the
map $\phi_G$ is generically injective.

Proof. We need to show that for generic choices of
$\Lambda \in \mathbb{R}^D$ and $\Omega \in \mathrm{PD}(B)$, the fiber
$\mathcal{F}(\Lambda, \Omega)$ is equal to the singleton
$\{(\Lambda, \Omega)\}$. Set $\Lambda = 0$ and choose $\Omega$ to be the
identity matrix. Then each of the matrix products

(3.5)    $\Omega_{[i] \setminus S(i),[i]}\,(I - \Lambda)^{-1}_{[i],P(i)}, \qquad i = 1, \dots, m - 1,$

has the identity matrix as $P(i) \times P(i)$ submatrix. The rank condition
from Lemma 2 thus holds for all $i \le m - 1$. Since the matrices in (3.5)
have polynomial entries, existence of a single pair $(\Lambda, \Omega)$ at
which the $m - 1$ matrices in (3.5) have full column rank implies that the
set of pairs $(\Lambda, \Omega)$ for which at least one of the matrices fails
to have full column rank is a lower-dimensional algebraic set; compare [5],
Chapter 9, for background on such algebraic arguments. □

In order to prepare for arguments turning the algebraic condition from
Lemma 2 into a graphical one, we detail the structure of the inverse
$(I - \Lambda)^{-1}$ for a matrix
$\Lambda = (\lambda_{ij}) \in \mathbb{R}^D$. Let $\mathcal{P}(i, j)$ denote
the set of directed paths from $i$ to $j$ in the considered acyclic graph.

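Lemma 3. The entries of the inverse $(I - \Lambda)^{-1}$ are

    $(I - \Lambda)^{-1}_{ij} = \sum_{\pi \in \mathcal{P}(i,j)} \ \prod_{k \to l \in \pi} \lambda_{kl}, \qquad i, j \in [m].$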

Proof. This well-known fact can be shown by induction on the matrix 
size m and using the partitioning in (3.2) under a topological ordering of 
the nodes. □ 

Note that adopting the usual definition that takes an empty sum to be zero
and an empty product to be one, the formula in Lemma 3 states that
$(I - \Lambda)^{-1}_{ij} = 0$ if $i \neq j$ and
$\mathcal{P}(i, j) = \emptyset$, and it states that
$(I - \Lambda)^{-1}_{ii} = 1$ because $\mathcal{P}(i, i)$ contains only a
trivial path without edges.
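
The path formula of Lemma 3 can be checked numerically. The sketch below enumerates all directed paths by depth-first search, accumulates the products of edge coefficients, and compares the result with the matrix inverse. It assumes a strictly upper-triangular $\Lambda$ (a topological ordering), which guarantees that the recursion terminates; the function name and the test matrix are illustrative choices.

```python
import numpy as np

def path_sum_inverse(Lambda):
    """Entry (i, j) is the sum over directed paths i -> ... -> j of edge products."""
    m = Lambda.shape[0]
    out = np.zeros((m, m))

    def walk(start, node, weight):
        out[start, node] += weight            # one path from start to node
        for nxt in range(m):
            if Lambda[node, nxt] != 0.0:      # follow the edge node -> nxt
                walk(start, nxt, weight * Lambda[node, nxt])

    for i in range(m):
        walk(i, i, 1.0)                       # trivial path contributes 1
    return out

# Strictly upper-triangular coefficients for a complete DAG on four nodes.
Lambda = np.triu(np.arange(1, 17).reshape(4, 4) / 10.0, k=1)
by_paths = path_sum_inverse(Lambda)
by_inverse = np.linalg.inv(np.eye(4) - Lambda)
```

The two matrices `by_paths` and `by_inverse` should agree up to rounding.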

4. Necessity of the graphical condition for identifiability. We now prove 
that the graphical condition in Theorem 2, which states that there be no 
induced subgraph whose directed part contains a converging arborescence 
and whose bidirected part is connected, is necessary for the parametrization 
$\phi_G$ to be injective. By Lemma 1, it suffices to consider an acyclic mixed
graph whose directed part is a converging arborescence and whose bidirected 
part is a spanning tree. In light of Lemma 2, the necessity of the graphical 
condition in Theorem 2 then follows from the following result. 

Proposition 1. Let $G = (V, D, B)$ be an acyclic mixed graph with
topologically ordered vertex set $V = [m + 1]$. If $(V, D)$ is an
arborescence converging to $m + 1$ and $(V, B)$ is a spanning tree, then
there exists a pair of matrices $\Lambda \in \mathbb{R}^D$ and
$\Omega \in \mathrm{PD}(B)$ with

    $\mathrm{kernel}\bigl(\Omega_{[m] \setminus S(m),[m]}\,(I - \Lambda)^{-1}_{[m],P(m)}\bigr) \neq \{0\}.$

Let $C(\Lambda) \subseteq \mathbb{R}^m$ be the column span of
$(I - \Lambda)^{-1}_{[m],P(m)}$. We formulate a first lemma that we will use
to prove Proposition 1.

Lemma 4. If $V = [m + 1]$ and $(V, D)$ is an arborescence converging to
node $m + 1$, then the union of the linear spaces $C(\Lambda)$ for all
$\Lambda \in \mathbb{R}^D$ contains the set
$(\mathbb{R}^*)^m = (\mathbb{R} \setminus \{0\})^m$ of vectors with all
coordinates nonzero.

Proof. In the arborescence, there is a unique path $\pi(i)$ from any vertex
$i \in [m] \setminus P(m)$ to the sink node $m + 1$. Let $k(i)$ be the unique
node in $P(m)$ that lies on this path. Let $\Lambda \in \mathbb{R}^D$ and
$\alpha \in \mathbb{R}^{|P(m)|}$, and define the vector

    $\beta(\Lambda, \alpha) = (I - \Lambda)^{-1}_{[m],P(m)}\,\alpha \in \mathbb{R}^m.$

Since the principal submatrix $(I - \Lambda)^{-1}_{P(m),P(m)}$ is the
identity matrix (because the directed graph is a converging arborescence),
$\beta(\Lambda, \alpha)_i = \alpha_i$ for all $i \in P(m)$. For
$i \in [m] \setminus P(m)$, we use Lemma 3 to obtain

(4.1)    $\beta(\Lambda, \alpha)_i = \alpha_{k(i)} \prod_{j \to l \in \pi(i),\ l \le m} \lambda_{jl} = \lambda_{ij}\,\beta(\Lambda, \alpha)_j,$

where $i \to j \in G$ is the unique edge originating from $i$.

Let $x$ be any vector in $(\mathbb{R}^*)^m$. Our claim states that there
exist a matrix $\Lambda \in \mathbb{R}^D$ and vector $\alpha$ such that
$x = \beta(\Lambda, \alpha)$. Clearly, $\alpha$ has to be equal to the
subvector $x_{P(m)}$. The associated unique choice of $\Lambda$ is obtained
by recursively solving for the entries $\lambda_{ij}$ using the relationship
in (4.1). □

Let $R(m) = [m] \setminus S(m)$ be the "rest" of the nodes. We are left with
the problem of finding a matrix $\Omega \in \mathrm{PD}(B)$ for which some
vector in $(\mathbb{R}^*)^m$ lies in the kernel of the submatrix

    $\Omega_{R(m),[m]} = \bigl[\ \Omega_{R(m),R(m)} \quad \Omega_{R(m),S(m)}\ \bigr].$

Proposition 1 now follows by combining Lemma 4 with the next result.

Lemma 5. If $(V, B)$ is a tree on $V = [m + 1]$, then there exists a matrix
$\Omega \in \mathrm{PD}(B)$ such that the vector $1 = (1, \dots, 1)^T$ is in
the kernel of the submatrix $\Omega_{R(m),[m]}$.

Proof. Let $T$ be the set of all nodes in $R(m)$ that are connected to some
node in $S(m)$ by an edge in $B$. If $\Omega \in \mathrm{PD}(B)$, then the
submatrix $\Omega_{R(m),S(m)}$ has only zero entries in rows indexed by nodes
$i \in R(m) \setminus T$. If $i \in T$, then the $i$th row of
$\Omega_{R(m),S(m)}$ has at least one entry that is not constrained to zero
and may take any real value. Hence, we can choose a matrix
$\Omega_{R(m),S(m)}$ that has row sums

(4.2)    $\sum_{j \in S(m)} \omega_{ij} = \begin{cases} -1, & \text{if } i \in T, \\ 0, & \text{if } i \in R(m) \setminus T. \end{cases}$

Let $H = (R(m), B_{R(m)})$ be the induced subgraph of $G$ on vertex set
$R(m)$. The Laplacian of $H$, $L(H) = (l_{ij})$, is the symmetric
$R(m) \times R(m)$ matrix whose diagonal entries are the degrees of the
nodes in $H$ and whose off-diagonal entries $l_{ij}$ are equal to $-1$ if
$i \leftrightarrow j$ is an edge in $H$ and 0 otherwise. The Laplacian is
well known to be positive semidefinite with all row sums zero. For a subset
$C \subseteq [m]$, let $1_C \in \mathbb{R}^m$ be the vector with entries
equal to one at indices in $C$ and zero elsewhere. The kernel of $L(H)$ is
the direct sum of the linear spaces spanned by the vectors $1_C$ for the
connected components $C$ of the graph $H$; compare [4], Chapter 1.

Let $D_T = (d_{ij})$ be the diagonal matrix that has diagonal entry
$d_{ii} = 1$ if $i \in T$ and $d_{ii} = 0$ otherwise. Both $L(H)$ and $D_T$
are positive semidefinite matrices and thus the kernel of $L(H) + D_T$ is
equal to $\ker L(H) \cap \ker D_T$. Since $(V, B)$ is a connected graph,
each connected component of $H$ contains a node in $T$. Therefore, none of
the vectors $1_C$ are in the kernel of $D_T$, where $C$ ranges over all
connected components of $H$. This implies that
$\ker(L(H) + D_T) = \{0\}$, and hence this matrix is positive definite.

Let $\Omega$ be any matrix in $\mathrm{PD}(B)$ whose submatrix
$\Omega_{R(m),S(m)}$ satisfies (4.2) and whose principal submatrix
$\Omega_{R(m),R(m)}$ is the positive definite matrix $L(H) + D_T$. The
matrix $\Omega \in \mathrm{PD}(B)$ has the desired property because

    $\Omega_{R(m),[m]}\,1 = (L(H) + D_T)\,1 + \Omega_{R(m),S(m)}\,1 = 1_T - 1_T = 0.$

Such matrices exist because we can choose $\Omega_{S(m),S(m)}$ to be, for
instance, a diagonal matrix with very large diagonal entries. Principal
minors of $\Omega$ that are not submatrices of $\Omega_{R(m),R(m)}$ will be
dominated by these diagonal entries and hence be positive. All other
principal minors are positive since
$\Omega_{R(m),R(m)} = L(H) + D_T$ was shown to be positive definite. □
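
The construction in the proof of Lemma 5 can be sketched as follows. Nodes are labeled 0 to $m$, with node $m$ playing the role of $m + 1$ in the text. The block on $R(m)$ is set to $L(H) + D_T$, each row in $T$ receives a single entry $-1$ in the block on $S(m)$ so that (4.2) holds, and the block on $S(m)$ is a multiple of the identity. The function name, the choice of which sibling receives the entry $-1$, the row and column of the last node, and the default value of `scale` are assumptions made for this sketch; for larger graphs `scale` may have to be increased to obtain a positive definite matrix.

```python
import numpy as np

def lemma5_omega(m, B, scale=None):
    """Build Omega with Omega_{R(m),[m]} 1 = 0 for a connected bidirected part."""
    nbrs = {v: set() for v in range(m + 1)}
    for i, j in B:
        nbrs[i].add(j)
        nbrs[j].add(i)
    S = sorted(nbrs[m])                            # siblings of the last node
    R = [v for v in range(m) if v not in nbrs[m]]  # the "rest" of the nodes
    Omega = np.zeros((m + 1, m + 1))
    Omega[m, m] = 1.0                              # last node: unit variance
    for i in R:
        in_R = [j for j in nbrs[i] if j in R]
        in_S = [j for j in nbrs[i] if j in S]
        for j in in_R:
            Omega[i, j] = -1.0                     # off-diagonal Laplacian entry
        Omega[i, i] = len(in_R) + (1.0 if in_S else 0.0)  # degree plus D_T
        if in_S:                                   # i lies in T: row sum over S is -1
            j = min(in_S)
            Omega[i, j] = Omega[j, i] = -1.0
    big = scale if scale is not None else 10.0 * (m + 1)
    for j in S:
        Omega[j, j] = big                          # large diagonal on the S block
    return Omega, R, S

# Hypothetical bidirected tree 1 <-> 3, 1 <-> 4, 2 <-> 4, written 0-based.
m = 3
Omega, R, S = lemma5_omega(m, [(0, 2), (0, 3), (1, 3)])
kernel_check = Omega[np.ix_(R, range(m))] @ np.ones(m)
min_eig = np.linalg.eigvalsh(Omega).min()
```

In this example `kernel_check` is the zero vector and `min_eig` is positive, so the constructed matrix is positive definite with the vector of ones in the required kernel.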

5. Sufficiency of the graphical condition for identifiability. In this
section, we prove that the graphical condition in Theorem 2, which requires
an acyclic mixed graph $G$ to have no induced subgraph whose directed part
contains a converging arborescence and whose bidirected part is connected,
is sufficient for the parametrization $\phi_G$ to be injective.
Proposition 4 below shows that if $\phi_G$ is not injective and $G$ does not
contain an induced subgraph with both a converging arborescence and a
bidirected spanning tree, then there is a subgraph $G'$ with fewer nodes
such that $\phi_{G'}$ still fails to be injective. The sufficiency of the
graphical condition then follows immediately. To see this, note that a graph
$G$ with noninjective parametrization $\phi_G$ must contain some minimal
induced subgraph $G'$ with noninjective $\phi_{G'}$. Applying the
contrapositive of Proposition 4 to $G'$, we conclude that the directed part
of $G'$ contains a converging arborescence and the bidirected part of $G'$
is connected.

In preparing for the proof of Proposition 4, we first treat the case when
there is no arborescence; this gives Proposition 2. The case when there is
no bidirected spanning tree is treated in Proposition 3. In either case, we
reduce a given graph $G = (V, D, B)$ to the subgraph $G_W$ induced by a
subset $W \subseteq V$. We use the notation $\tilde\Lambda$, $\tilde\Omega$,
$\tilde P(i)$, $\tilde S(i)$, $\tilde{\mathcal{P}}(i, j)$ to denote the
counterparts to $\Lambda$, $\Omega$, $P(i)$, $S(i)$ and $\mathcal{P}(i, j)$,
when performing this reduction of $G$ to $G_W$.

Proposition 2. Let $G = (V, D, B)$ be an acyclic mixed graph with
topologically ordered vertex set $V = [m + 1]$, with some
$\Lambda \in \mathbb{R}^D$, $\Omega \in \mathrm{PD}(B)$ and nonzero
$\alpha \in \mathbb{R}^{|P(m)|}$, such that

    $\Omega_{[m] \setminus S(m),[m]}\,(I - \Lambda)^{-1}_{[m],P(m)}\,\alpha = 0.$

Suppose the directed part of $G$ does not contain an arborescence converging
to $m + 1$. Let $A$ be the set of nodes $i \le m$ with some path of directed
edges from $i$ to $m + 1$, and $W = A \cup \{m + 1\}$. Then
$W \subsetneq V$ and $\phi_{G_W}$ is not injective.

Proof. Since G does not have a converging arborescence, A C [m] and 
WCV. 

Denote the induced subgraph as Gw = {W,D,B). Let A = A^y^^y G M.^ 
and 0, = 0,w,w ^ PD{B). Note that P{m) ^ Ahy definition, and so P{m) = 
P(m). Suppose j G P(m). Then for each i G [m] \^, V{i,j) = by definition, 
and so (/ — A)~-^ = by Lemma 3. For each i €z A, and for any path i ^ 
" ' ^ '^k ^ j in G, each intermediate vertex . . . , is in j4 by definition 
of A (since there is an edge j ^ m + 1). Therefore, V{i,j) = V{i,j), and it 
follows that (/ — A)~"^ = {I — ■ III other words, when the nodes outside 
of W are removed from G, the remaining entries of (/ — A)~^ are unchanged, 
while the removed entries in the columns indexed by P{m) = P{m) are all 
zero. We obtain that 

ieA ieA 

= ^A\S{r,i),[m] (I - ^)[J,lP(m)^- 

By assumption, the last quantity is zero. By Lemma 2, (pCw is not injective. 

□ 
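
The reduction in Proposition 2 is purely graph-theoretic and can be sketched in a few lines of Python. The function names and the example graph are illustrative assumptions, not part of the paper. The sketch relies on the fact that a directed graph contains an arborescence converging to a node exactly when every other node has a directed path to that node.

```python
from collections import deque


def ancestors_of(target, directed_edges):
    """Nodes with a directed path (of length >= 1) to `target`."""
    parents = {}
    for i, j in directed_edges:
        parents.setdefault(j, set()).add(i)
    found = set()
    queue = deque([target])
    while queue:
        node = queue.popleft()
        for p in parents.get(node, ()):
            if p not in found:
                found.add(p)
                queue.append(p)
    return found


def has_converging_arborescence(nodes, directed_edges, target):
    """True if every node other than `target` has a directed path to it."""
    return set(nodes) - {target} <= ancestors_of(target, directed_edges)


# Hypothetical example with m + 1 = 4: node 2 has no directed path to 4.
nodes = [1, 2, 3, 4]
directed_edges = [(1, 2), (1, 4), (3, 4)]
A = ancestors_of(4, directed_edges)  # plays the role of A in Proposition 2
W = A | {4}                          # the reduced node set W
```

In this example `W` omits node 2, so the induced subgraph on `W` is a proper subgraph, as in the statement of the proposition.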
We next prove a similar proposition for graphs whose bidirected part is not 
connected. The proof uses Lemmas 6 and 8, which are derived in Section 6. 

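Proposition 3. Let $G = (V, D, B)$ be an acyclic mixed graph with
topologically ordered vertex set $V = [m+1]$, with some
$\Lambda \in \mathbb{R}^D$, $\Omega \in PD(B)$, and nonzero
$\alpha \in \mathbb{R}^{|P(m)|}$, such that

\[
  \Omega_{[m]\setminus S(m),[m]} (I - \Lambda)^{-1}_{[m],P(m)} \alpha = 0.
\]

Suppose the bidirected part of $G$ is not connected. Let $A$ be the set of
nodes $i \le m$ with some path of bidirected edges from $i$ to $m+1$, and
$W = A \cup \{m+1\}$. Then $W \subsetneq V$ and $\phi_{G_W}$ is not injective.

Proof. Since the bidirected part is not connected, $A \subsetneq [m]$ and
$W \subsetneq V$.

Denote the induced subgraph as $G_W = (W, \tilde D, \tilde B)$. Let
$\tilde\Lambda = \Lambda_{W,W} \in \mathbb{R}^{\tilde D}$ and
$\tilde\Omega = \Omega_{W,W} \in PD(\tilde B)$. If $i \in S(m)$, then it holds
trivially that $i \in A$ and thus $\tilde S(m) = S(m)$. By Lemma 8 below,

\[
\begin{aligned}
\tilde\Omega_{A\setminus\tilde S(m),A} (I - \tilde\Lambda)^{-1}_{A,\tilde P(m)} \alpha_{\tilde P(m)}
  &= \Omega_{A\setminus S(m),A} (I - \Lambda)^{-1}_{A,P(m)} \alpha \\
  &= \Omega_{A\setminus S(m),[m]} (I - \Lambda)^{-1}_{[m],P(m)} \alpha
   - \Omega_{A\setminus S(m),[m]\setminus A} (I - \Lambda)^{-1}_{[m]\setminus A,P(m)} \alpha.
\end{aligned}
\]

By hypothesis, the first term in the last line is zero. By Lemma 6 below,
$(I - \Lambda)^{-1}_{[m]\setminus A,P(m)} \alpha = 0$, and so the second term
in the last line is zero as well. Therefore,

\[
  \tilde\Omega_{A\setminus\tilde S(m),A} (I - \tilde\Lambda)^{-1}_{A,\tilde P(m)} \alpha_{\tilde P(m)} = 0.
\]

It remains to be shown that $\alpha_{\tilde P(m)} \neq 0$. Suppose instead
that $\alpha_{\tilde P(m)} = 0$. Then, using Lemma 6, we obtain that

\[
\begin{aligned}
0 &= (I - \Lambda)^{-1}_{[m]\setminus A,P(m)} \alpha \\
  &= (I - \Lambda)^{-1}_{[m]\setminus A,\tilde P(m)} \alpha_{\tilde P(m)}
   + (I - \Lambda)^{-1}_{[m]\setminus A,P(m)\setminus\tilde P(m)} \alpha_{P(m)\setminus\tilde P(m)} \\
  &= 0 + (I - \Lambda)^{-1}_{[m]\setminus A,P(m)\setminus\tilde P(m)} \alpha_{P(m)\setminus\tilde P(m)}.
\end{aligned}
\]

However, $P(m) \setminus \tilde P(m) \subseteq [m] \setminus A$ and thus
$(I - \Lambda)^{-1}_{[m]\setminus A,P(m)\setminus\tilde P(m)}$ is a submatrix
of $(I - \Lambda)^{-1}_{[m]\setminus A,[m]\setminus A}$, which is a full rank
matrix as it is upper triangular with ones on the diagonal. Therefore,
$(I - \Lambda)^{-1}_{[m]\setminus A,P(m)\setminus\tilde P(m)}$ has full column
rank, and so $\alpha_{P(m)\setminus\tilde P(m)} = 0$. It follows that
$\alpha = 0$, which is a contradiction. We conclude that
$\alpha_{\tilde P(m)} \neq 0$ and, by Lemma 2, that $\phi_{G_W}$ is not
injective. □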
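
The node set used in Proposition 3 is the bidirected connected component of the last node. A minimal Python sketch of that construction follows; the function name and the example edge list are hypothetical and serve only as illustration.

```python
def bidirected_component(target, bidirected_edges):
    """Nodes joined to `target` by a path of bidirected edges (target included)."""
    neighbors = {}
    for i, j in bidirected_edges:
        neighbors.setdefault(i, set()).add(j)
        neighbors.setdefault(j, set()).add(i)
    component = {target}
    stack = [target]
    while stack:
        node = stack.pop()
        for nb in neighbors.get(node, ()):
            if nb not in component:
                component.add(nb)
                stack.append(nb)
    return component


# Hypothetical example with m + 1 = 5; nodes 2 and 4 form a separate component.
bidirected_edges = [(1, 5), (1, 3), (2, 4)]
W = bidirected_component(5, bidirected_edges)  # A together with m + 1
A = W - {5}
siblings_of_last = {1}                         # nodes joined to 5 by one edge
```

Every sibling of the last node lies in `A` by construction, which is the observation behind the identity between the sibling sets of the original and the reduced graph used in the proof.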
Proposition 4. Let $G = (V, D, B)$ be an acyclic mixed graph with
topologically ordered vertex set $V = [m+1]$, such that the parametrization
$\phi_G$ is not injective. If either the directed part of $G$ does not contain
an arborescence converging to $m+1$, or the bidirected part of $G$ is not
connected, then there is some proper induced subgraph $G_W$ of $G$ for which
the parametrization $\phi_{G_W}$ is not injective.

Proof. From Lemma 2, for some $i \le m$, $\Lambda \in \mathbb{R}^D$ and
$\Omega \in PD(B)$,

\[
  \operatorname{rank}\bigl(\Omega_{[i]\setminus S(i),[i]} (I - \Lambda)^{-1}_{[i],P(i)}\bigr) < |P(i)|. \tag{5.1}
\]

Suppose $i < m$. Take $W = [i+1]$, and denote the induced subgraph as
$G_W = (W, \tilde D, \tilde B)$. It holds trivially that
$\tilde\Lambda := \Lambda_{[i+1],[i+1]} \in \mathbb{R}^{\tilde D}$ and
$\tilde\Omega := \Omega_{[i+1],[i+1]} \in PD(\tilde B)$, and furthermore
$(I - \tilde\Lambda)^{-1} = (I - \Lambda)^{-1}_{[i+1],[i+1]}$, because
$I - \Lambda$ is upper triangular under the topological order. It is then
clear that, by Lemma 2, $\phi_{G_W}$ is not injective.

Next suppose instead that (5.1) is true for $i = m$. If the directed part of
$G$ does not contain an arborescence converging to $m+1$, then apply
Proposition 2 to produce a proper induced subgraph $G_W$ with $\phi_{G_W}$
noninjective. If instead the bidirected part of $G$ is not connected, then
apply Proposition 3 to produce a proper induced subgraph $G_W$ with
$\phi_{G_W}$ noninjective.

In all cases, we have constructed a subset $W \subsetneq V$ with $\phi_{G_W}$
not injective.

□ 
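
The first case of the proof uses the fact that, for a topologically ordered acyclic graph, restricting to an initial segment of nodes does not change the corresponding entries of the inverse of I - Lambda. The following Python sketch checks this on a small example with exact rational arithmetic. The function name, the dictionary encoding of Lambda and the numerical coefficients are assumptions made for illustration only.

```python
from fractions import Fraction


def inverse_of_I_minus_lambda(lam, m):
    """Entries of (I - Lambda)^{-1} for nodes 1..m in topological order.

    `lam` maps edges (i, j) with i < j to coefficients. The identity
    X = I + Lambda X is solved row by row, from the last node to the first.
    """
    X = {}
    for i in range(m, 0, -1):
        for k in range(1, m + 1):
            total = Fraction(int(i == k))
            for j in range(i + 1, m + 1):
                if (i, j) in lam:
                    total += lam[(i, j)] * X[(j, k)]
            X[(i, k)] = total
    return X


# Hypothetical coefficients on a graph with 4 nodes.
lam = {(1, 2): 2, (1, 3): 3, (2, 3): 5, (2, 4): 7, (3, 4): 11}
full = inverse_of_I_minus_lambda(lam, 4)

# Restrict to the induced subgraph on the first three nodes.
lam_sub = {edge: value for edge, value in lam.items() if edge[1] <= 3}
reduced = inverse_of_I_minus_lambda(lam_sub, 3)

blocks_agree = all(full[(i, k)] == reduced[(i, k)]
                   for i in range(1, 4) for k in range(1, 4))
```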

6. Proofs of lemmas in Section 5. 

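Lemma 6. Let $G$, $\Lambda$, $\Omega$, $\alpha$, and $A$ be as in the
statement of Proposition 3. Then
$(I - \Lambda)^{-1}_{[m]\setminus A,P(m)} \alpha = 0$.

Proof. If $i \in [m] \setminus A$ and $j \in A$, then, by definition of $A$,
it holds that $\omega_{ij} = 0$. Therefore, $\Omega_{[m]\setminus A,A} = 0$
and we obtain that

\[
  \Omega_{[m]\setminus A,[m]\setminus A} (I - \Lambda)^{-1}_{[m]\setminus A,P(m)} \alpha
  = \Omega_{[m]\setminus A,[m]} (I - \Lambda)^{-1}_{[m],P(m)} \alpha = 0.
\]

For the last equality, observe that
$[m] \setminus A \subseteq [m] \setminus S(m)$ since $S(m) \subseteq A$. Since
$\Omega_{[m]\setminus A,[m]\setminus A}$ is positive definite, the claim
follows. □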

For a directed path $\pi$ in the graph $G$, we write $\pi \not\subset G_A$ to
indicate that not all the nodes of $\pi$ lie in $A$. Also, by convention,
$\mathcal P(j,j)$ is a singleton set containing the trivial path at $j$; in
this case $\pi$ has no edges and we define
$\prod_{a\to b\in\pi} \lambda_{ab} = 1$.

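Lemma 7. Let $G$, $\Lambda$, $\Omega$, $\alpha$, and $A$ be as in the
statement of Proposition 3. Then for every $i \le m$,

\[
  \sum_{k\in P(m)} \alpha_k \Bigl( \sum_{\pi\in\mathcal P(i,k),\,\pi\not\subset G_A}
    \prod_{a\to b\in\pi} \lambda_{ab} \Bigr) = 0.
\]

Proof. First, we prove the claim for $i \notin A$. Working from Lemma 6, we
have that

\[
  0 = \sum_{k\in P(m)} (I - \Lambda)^{-1}_{ik} \alpha_k
    = \sum_{k\in P(m)} \alpha_k \sum_{\pi\in\mathcal P(i,k)} \prod_{a\to b\in\pi} \lambda_{ab}. \tag{6.1}
\]

Since $i \notin A$, any path $\pi \in \mathcal P(i,k)$ for any $k$
necessarily satisfies $\pi \not\subset G_A$. Hence, we can rewrite (6.1) as

\[
  \sum_{k\in P(m)} \alpha_k \Bigl( \sum_{\pi\in\mathcal P(i,k),\,\pi\not\subset G_A}
    \prod_{a\to b\in\pi} \lambda_{ab} \Bigr) = 0.
\]

Next, we address the case $i \in A$. Inducting on $i$ in decreasing order, we
may assume that the claim holds for all $j \in \{i+1, i+2, \dots, m\}$. [As a
base case, we can set $i = m$ because, by the assumed topological order,
$\mathcal P(m,k) = \emptyset$ for all nodes $k < m$.] The quantity claimed to
be vanishing is

\[
  \sum_{k\in P(m)} \alpha_k \Bigl( \sum_{\pi\in\mathcal P(i,k),\,\pi\not\subset G_A}
    \prod_{a\to b\in\pi} \lambda_{ab} \Bigr)
  = \sum_{k\in P(m)} \alpha_k \Bigl( \sum_{j:\, i\to j\in D} \lambda_{ij}
    \sum_{\pi'\in\mathcal P(j,k),\,\pi'\not\subset G_A}
    \prod_{a\to b\in\pi'} \lambda_{ab} \Bigr). \tag{6.2}
\]

This last equality is obtained by splitting any path
$\pi = i \to v_1 \to v_2 \to \cdots \to v_n \to k$ into $i \to j := v_1$ and
$\pi' = j \to v_2 \to \cdots \to v_n \to k$. (Note that the path of length
zero at $i$ is not in the sum, since this path would not satisfy
$\pi \not\subset G_A$.) Since we assume $i \in A$, it holds that
$\pi \not\subset G_A$ if and only if $\pi' \not\subset G_A$. Interchanging the
order of the summations in (6.2), we obtain that

\[
\begin{aligned}
  \sum_{k\in P(m)} \alpha_k \Bigl( \sum_{\pi\in\mathcal P(i,k),\,\pi\not\subset G_A}
    \prod_{a\to b\in\pi} \lambda_{ab} \Bigr)
  &= \sum_{j:\, i\to j\in D} \sum_{k\in P(m)} \lambda_{ij}\, \alpha_k
    \sum_{\pi'\in\mathcal P(j,k),\,\pi'\not\subset G_A}
    \prod_{a\to b\in\pi'} \lambda_{ab} \\
  &= \sum_{j:\, i\to j\in D} \lambda_{ij} \Bigl( \sum_{k\in P(m)} \alpha_k
    \sum_{\pi'\in\mathcal P(j,k),\,\pi'\not\subset G_A}
    \prod_{a\to b\in\pi'} \lambda_{ab} \Bigr).
\end{aligned}
\]

Working with a topologically ordered set of nodes, the presence of an edge
$i \to j$ implies $i < j$. The inductive hypothesis thus yields that

\[
  \sum_{k\in P(m)} \alpha_k \Bigl( \sum_{\pi\in\mathcal P(i,k),\,\pi\not\subset G_A}
    \prod_{a\to b\in\pi} \lambda_{ab} \Bigr)
  = \sum_{j:\, i\to j\in D} \lambda_{ij} \cdot 0 = 0,
\]

which completes the inductive step and the proof of the lemma. □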
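Lemma 8. Let $G$, $\Lambda$, $\Omega$, $\alpha$ and $A$ be as in the statement
of Proposition 3. Then for all $i \in A$,

\[
  (I - \tilde\Lambda)^{-1}_{i,\tilde P(m)} \alpha_{\tilde P(m)}
  = (I - \Lambda)^{-1}_{i,P(m)} \alpha.
\]

Proof. The right-hand side of the above equation can be rewritten as

\[
\begin{aligned}
  (I - \Lambda)^{-1}_{i,P(m)} \alpha
  &= \sum_{k\in P(m)} (I - \Lambda)^{-1}_{ik} \alpha_k
   = \sum_{k\in P(m)} \alpha_k \sum_{\pi\in\mathcal P(i,k)} \prod_{a\to b\in\pi} \lambda_{ab} \\
  &= \sum_{k\in P(m)} \alpha_k \sum_{\pi\in\mathcal P(i,k),\,\pi\subset G_A}
     \prod_{a\to b\in\pi} \lambda_{ab}
   + \sum_{k\in P(m)} \alpha_k \sum_{\pi\in\mathcal P(i,k),\,\pi\not\subset G_A}
     \prod_{a\to b\in\pi} \lambda_{ab}.
\end{aligned}
\]

Consider the two sums in the last line above. By Lemma 7, the second sum is
equal to zero. Note also that if $k \in P(m) \setminus A$, then there is no
path $\pi \in \mathcal P(i,k)$ with $\pi \subset G_A$. Therefore, the first
sum can be indexed over $k \in \tilde P(m)$. We thus obtain that, as claimed,

\[
\begin{aligned}
  (I - \Lambda)^{-1}_{i,P(m)} \alpha
  &= \sum_{k\in\tilde P(m)} \alpha_k \Bigl( \sum_{\pi\in\mathcal P(i,k),\,\pi\subset G_A}
     \prod_{a\to b\in\pi} \lambda_{ab} \Bigr) \\
  &= \sum_{k\in\tilde P(m)} (I - \tilde\Lambda)^{-1}_{ik} \alpha_k
   = (I - \tilde\Lambda)^{-1}_{i,\tilde P(m)} \alpha_{\tilde P(m)}.
\end{aligned}
\]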



□ 
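
The proofs of Lemmas 7 and 8 split the path expansion of an entry of the inverse of I - Lambda into paths that stay inside A and paths that leave A. The Python sketch below carries out this split by brute-force path enumeration on a small acyclic graph. All names, the node set A and the coefficients are hypothetical choices for illustration, and exact rational arithmetic is used so that the sums can be compared without rounding.

```python
from fractions import Fraction


def directed_paths(i, k, children):
    """Yield each directed path from i to k in an acyclic graph as a node list."""
    if i == k:
        yield [i]  # the trivial path, whose edge product is 1 by convention
        return
    for j in children.get(i, ()):
        for rest in directed_paths(j, k, children):
            yield [i] + rest


def path_weight(path, lam):
    """Product of the coefficients lambda_ab over the edges a -> b of the path."""
    weight = Fraction(1)
    for a, b in zip(path, path[1:]):
        weight *= lam[(a, b)]
    return weight


def split_path_sum(i, k, lam, A):
    """Return (sum over paths inside A, sum over paths leaving A) from i to k."""
    children = {}
    for a, b in lam:
        children.setdefault(a, []).append(b)
    inside = Fraction(0)
    outside = Fraction(0)
    for path in directed_paths(i, k, children):
        if set(path) <= A:
            inside += path_weight(path, lam)
        else:
            outside += path_weight(path, lam)
    return inside, outside


# Hypothetical example: paths from 1 to 4, with node 3 lying outside A.
lam = {(1, 2): 2, (1, 3): 3, (2, 3): 5, (2, 4): 7, (3, 4): 11}
inside, outside = split_path_sum(1, 4, lam, A={1, 2, 4})
```

By the path expansion of Lemma 3, `inside + outside` is the (1, 4) entry of the inverse of I - Lambda for these coefficients.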



7. Cyclic models. In this section, we prove Theorem 1 from the
Introduction, which states that only acyclic mixed graphs may yield globally
identifiable models. By Lemma 1, the theorem holds if we can show that the
parametrization $\phi_G$ is not injective when $G$ is a simple directed cycle,
that is, when $G$ is isomorphic to the cycle

\[
  1 \to 2 \to \cdots \to m \to 1 \tag{7.1}
\]

for some $m \ge 3$. This noninjectivity is shown in the next lemma. Recall the
definition of a fiber in (2.1).

Lemma 9. Let $G = (V, D, B)$ be a simple directed cycle on $m \ge 3$ nodes,
$\Lambda \in \mathbb{R}^D_{\mathrm{reg}}$ and $\Omega \in PD(B)$. Then the
cardinality of the fiber $\mathcal F(\Lambda, \Omega)$ is at most two and is
equal to two for generic choices of $\Lambda$ and $\Omega$.

In order to prepare the proof of Lemma 9, note that for directed graphs
the set $PD(B) = PD(\emptyset)$ contains exactly the diagonal matrices with
positive diagonal entries. This set being invariant under matrix inversion, it
is convenient to consider the polynomial map

\[
  \kappa_G : (\Lambda, \Delta) \mapsto (I - \Lambda)\, \Delta\, (I - \Lambda)^T
\]

that parametrizes the inverse of the covariance matrix of the distributions in
the structural equation model. Since
$\kappa_G(\Lambda, \Delta) = \phi_G(\Lambda, \Delta^{-1})^{-1}$ for
$\Lambda \in \mathbb{R}^D_{\mathrm{reg}}$ and $\Delta \in PD(\emptyset)$, the
fibers of $\kappa_G$ and $\phi_G$ are in bijection with each other.

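Proof of Lemma 9. Without loss of generality, assume $G$ to be the
graph with the edges in (7.1). For shorter notation, we let
$\lambda_i = \lambda_{i,i+1}$, the parameter on the edge $i \to i+1$.
Throughout, indices are read cyclically with $m + i := i$ for $i \ge 1$. The
matrix $(I - \Lambda)$ is invertible if and only if
$\prod_{i=1}^m \lambda_i \neq 1$. Let $\delta_i = \Delta_{ii}$, the inverse of
the positive variance parameter associated with node $i$. Treating $\kappa_G$
as a function of a pair of vectors
$(\lambda, \delta) \in \mathbb{R}^m \times (0,\infty)^m$, we obtain that
$\kappa_G(\lambda, \delta)$ is equal to

\[
\begin{pmatrix}
\delta_1 + \delta_2\lambda_1^2 & -\delta_2\lambda_1 & 0 & \cdots & 0 & -\delta_1\lambda_m \\
-\delta_2\lambda_1 & \delta_2 + \delta_3\lambda_2^2 & -\delta_3\lambda_2 & & & 0 \\
0 & -\delta_3\lambda_2 & \delta_3 + \delta_4\lambda_3^2 & \ddots & & \vdots \\
\vdots & & \ddots & \ddots & \ddots & 0 \\
0 & & & \ddots & \delta_{m-1} + \delta_m\lambda_{m-1}^2 & -\delta_m\lambda_{m-1} \\
-\delta_1\lambda_m & 0 & \cdots & 0 & -\delta_m\lambda_{m-1} & \delta_m + \delta_1\lambda_m^2
\end{pmatrix}.
\]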

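Fix a pair $(\lambda^0, \delta^0) \in \mathbb{R}^m \times (0,\infty)^m$ with
$\prod_{i=1}^m \lambda_i^0 \neq 1$. We wish to describe the fiber

\[
  \{ (\lambda, \delta) \in \mathbb{R}^m \times (0,\infty)^m :
     \kappa_G(\lambda, \delta) = \kappa_G(\lambda^0, \delta^0) \}. \tag{7.2}
\]

Let $K^0 := \kappa_G(\lambda^0, \delta^0)$. The equation
$\kappa_G(\lambda, \delta) = K^0$ determining membership in the fiber amounts
to the system of the $2m$ polynomial equations

\[
  \delta_i + \delta_{i+1}\lambda_i^2 = K^0_{ii}, \tag{7.3a.i}
\]
\[
  -\delta_{i+1}\lambda_i = K^0_{i,i+1} \tag{7.3b.i}
\]

for $i = 1, \dots, m$. We split the problem into two cases, for which the
algebraic degree of the equation system given by (7.3a.i) and (7.3b.i)
differs.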
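
As a sanity check on the entries that appear on the left-hand sides of (7.3a.i) and (7.3b.i), the following Python sketch forms the product of I - Lambda, Delta and the transpose of I - Lambda for a directed cycle by explicit multiplication. The function name and the numerical values are illustrative assumptions, and lists are indexed from 0 in the code while the text indexes nodes from 1.

```python
from fractions import Fraction


def kappa_cycle(lam, delta):
    """(I - Lambda) Delta (I - Lambda)^T for the cycle 1 -> 2 -> ... -> m -> 1.

    lam[i] is the coefficient on the edge leaving node i + 1 and delta[i] is
    the corresponding diagonal entry of Delta (0-based lists).
    """
    m = len(lam)
    # I - Lambda has ones on the diagonal and -lambda_i in position (i, i + 1),
    # with the column index read cyclically.
    A = [[Fraction(int(r == c)) for c in range(m)] for r in range(m)]
    for i in range(m):
        A[i][(i + 1) % m] -= Fraction(lam[i])
    # K[r][c] = sum_k A[r][k] * delta[k] * A[c][k]
    return [[sum((A[r][k] * Fraction(delta[k]) * A[c][k] for k in range(m)),
                 Fraction(0))
             for c in range(m)]
            for r in range(m)]


# Hypothetical parameter values on a cycle with m = 4 nodes.
lam = [2, 3, 5, 7]
delta = [1, 2, 3, 4]
m = len(lam)
K = kappa_cycle(lam, delta)

diagonal_ok = all(K[i][i] == delta[i] + delta[(i + 1) % m] * lam[i] ** 2
                  for i in range(m))
offdiagonal_ok = all(K[i][(i + 1) % m] == -delta[(i + 1) % m] * lam[i]
                     for i in range(m))
```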

Case (i): Suppose $\lambda_i^0 = 0$ for some $i$. Without loss of generality,
$\lambda_1^0 = 0$ such that $K^0_{12} = 0$ and $K^0_{11} = \delta_1^0$. As a
consequence, the two equations (7.3a.i) and (7.3b.i) for $i = 1$ reduce to
$\delta_1 = \delta_1^0$ and $\lambda_1 = 0 = \lambda_1^0$. This provides the
basis for solving the remaining equations recursively in the order
$i = m, \dots, 2$. Each time the equation pair reduces to the linear equations
$\delta_i = \delta_i^0$ and $\lambda_i = \lambda_i^0$, and the fiber in (7.2)
is seen to be the singleton $\{(\lambda^0, \delta^0)\}$. Note that the problem
has become the same as parameter identification in the model based on the
acyclic graph obtained by removing the edge $1 \to 2$ from $G$. Note further
that the equation system is of degree one in this case.
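Case (ii): Assume now that $\lambda_i^0 \neq 0$ for all $i$. We claim that the
fiber in (7.2) then also contains the pair $(\lambda^1, \delta^1)$ that has
coordinates

\[
  \delta_i^1 = \delta_i^0
    + \frac{\bigl(\prod_{j=1}^m \delta_j^0\bigr)\bigl[\prod_{j=1}^m (\lambda_j^0)^2 - 1\bigr]}{\det(K^0_{-i})},
  \qquad
  \lambda_i^1 = \frac{\lambda_i^0\, \delta_{i+1}^0}{\delta_{i+1}^1}
\]

for $i = 1, 2, \dots, m$. Here $K^0_{-i}$ is the matrix obtained from $K^0$ by
removing the $i$th row and column. Note that
$(\delta^1, \lambda^1) \neq (\delta^0, \lambda^0)$ if and only if
$\prod_{i=1}^m \lambda_i^0 \neq -1$; recall that the product is assumed to be
different from 1 to ensure that $I - \Lambda$ is invertible. It is not very
difficult to check that $(\delta^1, \lambda^1)$ is indeed in the fiber; the
$m$ equations in (7.3b.i) are satisfied trivially, and the $m$ equations in
(7.3a.i) can be checked by plug-in. For this an explicit expression of
$\det(K^0_{-i})$ in terms of $(\lambda^0, \delta^0)$ is needed. Using the
Cauchy-Binet formula, one can show that

\[
  \det(K^0_{-i}) = \Bigl(\prod_{j=1}^m \delta_j^0\Bigr)
  \Bigl( \sum_{j=1}^{i} \frac{1}{\delta_j^0} \prod_{k=j}^{i-1} (\lambda_k^0)^2
       + \sum_{j=i+1}^{m} \frac{1}{\delta_j^0} \prod_{k=j}^{m+i-1} (\lambda_k^0)^2 \Bigr),
\]

where an empty product equals 1 and the indices $k$ are read cyclically.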

We furthermore claim that the fiber contains no points other than
$(\lambda^0, \delta^0)$ and $(\lambda^1, \delta^1)$. We outline the proof of
this claim, again leaving out some of the details.

Solve for $\lambda_1$ in equation (7.3b.i) for $i = 1$ and plug the resulting
expression in $\delta_2$ into the equation (7.3a.i) for $i = 1$. This equation
can be solved for $\delta_2$ to give an expression in $\delta_1$. Continue on
in this fashion for the indices $i = 2, \dots, m$, always obtaining an
expression in $\delta_1$ after solving (7.3a.i). Let
$[j : k] := \{j, \dots, k\}$ for integers $j \le k$. We find that, after the
$(i-1)$st step,

\[
  \delta_i = (K^0_{i-1,i})^2 \,
  \frac{\det(K^0_{[1:i-2],[1:i-2]}) - \det(K^0_{[2:i-2],[2:i-2]})\, \delta_1}
       {\det(K^0_{[1:i-1],[1:i-1]}) - \det(K^0_{[2:i-1],[2:i-1]})\, \delta_1},
\]

where we define $\det(K^0_{[1:0],[1:0]}) = \det(K^0_{[2:1],[2:1]}) = 1$ and
$\det(K^0_{[2:0],[2:0]}) = 0$. The last step of this procedure, namely,
plugging the expression for $\delta_m$ into the equation (7.3a.i) for $i = m$,
produces a rational equation in the single variable $\delta_1$. Clearing
denominators we obtain a quadratic equation in $\delta_1$ whose leading
coefficient simplifies to $\det(K^0_{-1})$ and thus is nonzero. Therefore, the
polynomial equation system in (7.3a.i)-(7.3b.i) has degree two and the fiber
in (7.2) contains precisely $(\lambda^0, \delta^0)$ and
$(\lambda^1, \delta^1)$. Note that the fiber has cardinality one (with a point
of multiplicity two) if $\prod_{i=1}^m \lambda_i^0 = -1$. □
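
The two-point structure of the fiber can be explored numerically. The Python sketch below assumes that the second fiber point has coordinates delta1_i = delta0_i + (product of all delta0_j) * (product of all squared lambda0_j, minus 1) / det(K0 with row and column i removed) and lambda1_i = lambda0_i * delta0_{i+1} / delta1_{i+1}, with indices read cyclically. This closed form, the function names and the example values are stated here as assumptions of the sketch; exact rational arithmetic makes the comparison of the two images exact.

```python
from fractions import Fraction


def kappa(lam, delta):
    """Entries of kappa_G for the cycle 1 -> 2 -> ... -> m -> 1 (m >= 3)."""
    m = len(lam)
    K = [[Fraction(0)] * m for _ in range(m)]
    for i in range(m):
        nxt = (i + 1) % m
        K[i][i] = Fraction(delta[i]) + Fraction(delta[nxt]) * Fraction(lam[i]) ** 2
        K[i][nxt] = K[nxt][i] = -Fraction(delta[nxt]) * Fraction(lam[i])
    return K


def det(M):
    """Determinant by Gaussian elimination over the rationals."""
    M = [[Fraction(x) for x in row] for row in M]
    n = len(M)
    result = Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if M[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            M[c], M[pivot] = M[pivot], M[c]
            result = -result
        result *= M[c][c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for k in range(c, n):
                M[r][k] -= factor * M[c][k]
    return result


def second_fiber_point(lam0, delta0):
    """Candidate second point (lam1, delta1) under the assumed closed form."""
    lam0 = [Fraction(x) for x in lam0]
    delta0 = [Fraction(x) for x in delta0]
    m = len(lam0)
    K0 = kappa(lam0, delta0)
    prod_delta = Fraction(1)
    prod_lam_sq = Fraction(1)
    for l, d in zip(lam0, delta0):
        prod_delta *= d
        prod_lam_sq *= l ** 2
    delta1 = []
    for i in range(m):
        minor = [[K0[r][c] for c in range(m) if c != i]
                 for r in range(m) if r != i]
        delta1.append(delta0[i] + prod_delta * (prod_lam_sq - 1) / det(minor))
    lam1 = [lam0[i] * delta0[(i + 1) % m] / delta1[(i + 1) % m]
            for i in range(m)]
    return lam1, delta1


# Hypothetical 3-cycle with product of coefficients equal to 3.
lam0 = [Fraction(2), Fraction(3), Fraction(1, 2)]
delta0 = [Fraction(1), Fraction(2), Fraction(3)]
lam1, delta1 = second_fiber_point(lam0, delta0)
same_image = kappa(lam1, delta1) == kappa(lam0, delta0)
```

When the product of the coefficients equals -1, the correction term vanishes and the sketch returns the original point, which matches the remark on the fiber of cardinality one.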
8. Conclusion. Our Theorems 1 and 2 fully characterize the mixed graphs 
for which the associated linear structural equation model is globally iden- 
tifiable. Globally identifiable models have smooth manifolds as parameter 
spaces, which implies in particular that maximum likelihood estimators are 
asymptotically normal for all choices of a true distribution in the model. 
Similarly, likelihood ratio statistics for testing two nested globally identi- 
fiable models are asymptotically chi-square. Example 1 demonstrates that 
these properties may fail in models that are only generically identifiable. The 
resulting inferential issues are also not so easily overcome using bootstrap 
methods; compare [1]. Nevertheless, generically identifiable models appear in 
various applications, and characterizing the mixed graphs that yield gener- 
ically identifiable linear structural equation models remains an important 
open problem. 

Acknowledgments. We are grateful to two referees and an associate ed- 
itor who provided very helpful comments on the original version of this 
paper. 